OCPBUGS-59734: fix(azure): resolve credential caching issues around UAMI support #1238

Merged (3 commits), Aug 14, 2025

Conversation

bryan-cox
Member

@bryan-cox bryan-cox commented Aug 1, 2025

Summary

This PR fixes credential caching issues in Azure storage operations and adds caching support for User Assigned Managed Identity (UAMI) credentials.

Caching at the driver level was not enough, so a global cache was introduced to avoid requesting new credentials from Azure over and over.

Changes

  • fix(azure): fix credential caching key mismatch in driver storageAccountsClient
    Resolves credential caching key inconsistencies in the storage accounts client

  • fix(azure): fix credential caching key mismatch in azureclient
    Fixes credential caching key mismatches in the Azure client implementation

  • feat(azure): add ensureUAMICredentials function with comprehensive tests

    • Adds new ensureUAMICredentials function to obtain and cache Azure TokenCredential using User Assigned Managed Identity (UAMI)
    • Function loads credentials from global cache or creates new ones with proper environment configuration
    • Comprehensive unit tests covering environment variable handling, cache behavior, and error scenarios
    • Tests follow table-driven pattern and use t.Setenv for proper environment handling
    • Updates Azure client instantiation to use UAMI credentials when available
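The load-from-cache-or-create behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `tokenCredential`, `globalCredentialCache`, and `ensureCredential` are hypothetical names, and the stand-in credential type replaces the Azure SDK's TokenCredential so the example runs with the standard library alone.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenCredential is a hypothetical stand-in for the Azure SDK
// credential type; it only carries an ID for illustration.
type tokenCredential struct {
	clientID string
}

// globalCredentialCache is a process-wide cache (assumed name), so a
// credential survives even when a driver object is rebuilt.
var globalCredentialCache sync.Map

// ensureCredential returns the cached credential for key, or builds
// one with newCred and stores it under the same key.
func ensureCredential(key string, newCred func() (*tokenCredential, error)) (*tokenCredential, error) {
	if v, ok := globalCredentialCache.Load(key); ok {
		return v.(*tokenCredential), nil
	}
	cred, err := newCred()
	if err != nil {
		// Nothing is cached on failure, so the next call retries.
		return nil, err
	}
	globalCredentialCache.Store(key, cred)
	return cred, nil
}

func main() {
	created := 0
	build := func() (*tokenCredential, error) {
		created++
		return &tokenCredential{clientID: "example-client-id"}, nil
	}
	for i := 0; i < 3; i++ {
		if _, err := ensureCredential("azure-credentials", build); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("credentials created:", created) // prints "credentials created: 1"
}
```

In this sketch two goroutines that miss the cache at the same moment could each build a credential; whether the real function guards against that is not stated in this conversation.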

Testing

  • Added comprehensive unit tests for the new ensureUAMICredentials function
  • Tests follow established codebase patterns and conventions

Impact

  • Improves credential management reliability in Azure storage operations
  • Enables proper UAMI support for managed Azure environments
  • No breaking changes to existing functionality

@sjenning
Contributor

sjenning commented Aug 1, 2025

/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Aug 1, 2025
@bryan-cox bryan-cox marked this pull request as draft August 1, 2025 17:36
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2025
@bryan-cox
Member Author

/test hypershift-e2e-aks

@bryan-cox bryan-cox changed the title fix(azure): fix credential caching key mismatch in UserAssignedIdentityCredentials OCPBUGS-60103: fix(azure): fix credential caching key mismatch in UserAssignedIdentityCredentials Aug 4, 2025
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 4, 2025
@openshift-ci-robot
Contributor

@bryan-cox: This pull request references Jira Issue OCPBUGS-60103, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Use consistent azureCredentialsKey for both storing and loading cached credentials instead of mixing azureCredentialsKey and userAssignedIdentityCredentialsFilePath. This prevents repeated credential recreation and eliminates redundant log messages.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

…untsClient

Similar to the azureclient fix, use consistent azureCredentialsKey for both
storing and loading cached credentials instead of mixing azureCredentialsKey
and userAssignedIdentityCredentialsFilePath. This prevents repeated credential
recreation in the Azure driver's storageAccountsClient method.
Use consistent azureCredentialsKey for both storing and loading cached
credentials instead of mixing azureCredentialsKey and
userAssignedIdentityCredentialsFilePath. This prevents repeated credential
recreation in the Azure driver's storageAccountsClient method.

Signed-off-by: Bryan Cox <[email protected]>
@bryan-cox
Member Author

/test hypershift-e2e-aks

@bryan-cox
Member Author

/test hypershift-e2e-aks

@bryan-cox
Member Author

/test hypershift-e2e-aks

@bryan-cox
Member Author

/test hypershift-e2e-aks

@flavianmissi
Member

Hey @bryan-cox! Can you elaborate on your decision of caching the entire storage driver, vs for example caching the credentials within the driver?

@bryan-cox
Member Author

> Hey @bryan-cox! Can you elaborate on your decision of caching the entire storage driver, vs for example caching the credentials within the driver?

@flavianmissi it's still in WIP so I wouldn't consider what is here to be the final solution.

@flavianmissi
Member

Understood, I'll hold my horses 🤠

@bryan-cox bryan-cox changed the title OCPBUGS-60103: fix(azure): fix credential caching key mismatch in UserAssignedIdentityCredentials OCPBUGS-59734: fix(azure): fix credential caching key mismatch in UserAssignedIdentityCredentials Aug 12, 2025
@openshift-ci-robot
Contributor

@bryan-cox: This pull request references Jira Issue OCPBUGS-59734, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci-robot
Contributor

@bryan-cox: This pull request references Jira Issue OCPBUGS-59734, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

In response to this:

/jira refresh


@openshift-ci-robot
Contributor

@bryan-cox: This pull request references Jira Issue OCPBUGS-59734, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

@bryan-cox
Member Author

@flavianmissi this is ready for review now 😄

@bryan-cox
Member Author

The logs show the credentials being stored once and then loaded from the cache from then on:

'''
I0813 00:12:11.956008 1 azure.go:1376] Storing UAMI credentials to global cache
...

I0813 00:21:04.989721 1 azure.go:1348] Loaded UAMI credentials from cache
I0813 00:21:05.603952 1 azure.go:1348] Loaded UAMI credentials from cache
I0813 00:21:26.034294 1 azure.go:1348] Loaded UAMI credentials from cache
I0813 00:21:26.333995 1 azure.go:1348] Loaded UAMI credentials from cache
'''

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_cluster-image-registry-operator/1238/pull-ci-openshift-cluster-image-registry-operator-main-hypershift-e2e-aks/1955410821643243520/artifacts/hypershift-e2e-aks/hypershift-azure-run-e2e/artifacts/TestCreateCluster/namespaces/e2e-clusters-xhkq8-create-cluster-vb2c6/core/pods/logs/cluster-image-registry-operator-f599f47bd-qbwvt-cluster-image-registry-operator.log

@bryan-cox
Member Author

/test e2e-aws-operator

- Add ensureUAMICredentials function to obtain and cache Azure TokenCredential using User Assigned Managed Identity (UAMI)
- Function loads credentials from global cache or creates new ones with proper environment configuration
- Add comprehensive unit tests covering environment variable handling, cache behavior, and error scenarios
- Tests use table-driven pattern following codebase conventions and t.Setenv for proper environment handling
- Update Azure client instantiation to use UAMI credentials when available
@flavianmissi
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 13, 2025
Contributor

openshift-ci bot commented Aug 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox, flavianmissi, sjenning

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 13, 2025
@xenolinux

/label docs-approved

@openshift-ci openshift-ci bot added the docs-approved Signifies that Docs has signed off on this PR label Aug 13, 2025
@sferich888
Contributor

/label px-approved

@openshift-ci openshift-ci bot added the px-approved Signifies that Product Support has signed off on this PR label Aug 13, 2025
@bryan-cox
Member Author

/test e2e-hypershift-conformance

Contributor

openshift-ci bot commented Aug 13, 2025

@bryan-cox: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                     | Commit  | Details | Required | Rerun command
ci/prow/e2e-azure-operator    | 6d90565 | link    | false    | /test e2e-azure-operator
ci/prow/okd-scos-e2e-aws-ovn  | 6d90565 | link    | false    | /test okd-scos-e2e-aws-ovn
ci/prow/e2e-azure-ovn         | 6d90565 | link    | false    | /test e2e-azure-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@xiuwang

xiuwang commented Aug 14, 2025

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Aug 14, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit b04c7c1 into openshift:main Aug 14, 2025
13 of 16 checks passed
@openshift-ci-robot
Contributor

@bryan-cox: Jira Issue OCPBUGS-59734: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-59734 has been moved to the MODIFIED state.

@bryan-cox
Member Author

/jira backport release-4.19

@openshift-ci-robot
Contributor

@bryan-cox: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.19

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #1243

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-image-registry-operator
This PR has been included in build ose-cluster-image-registry-operator-container-v4.20.0-202508141315.p0.gb04c7c1.assembly.stream.el9.
All builds following this will include this PR.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • docs-approved: Signifies that Docs has signed off on this PR
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • px-approved: Signifies that Product Support has signed off on this PR
  • qe-approved: Signifies that QE has signed off on this PR
9 participants